A very low bit rate speech coder using HMM-based speech recognition/synthesis techniques
نویسندگان
چکیده
This paper presents a very low bit rate speech coder based on HMM (Hidden Markov Model). The encoder carries out phoneme recognition, and transmits phoneme indexes, state durations and pitch information to the decoder. In the decoder, phoneme HMMs are concatenated according to the phoneme indexes, and a sequence of mel-cepstral coefficient vectors is generated from the concatenated HMM by using an ML-based speech parameter generation technique. Finally we obtain synthetic speech by exciting the MLSA (Mel Log Spectrum Approximation) filter, whose coefficients are given by mel-cepstral coefficients, according to the pitch information. A subjective listening test shows that the performance of the proposed coder at about 150 bit/s (for the test data including 26 % silence region) is comparable to a VQ-based vocoder at 400 bit/s (= 8 bit/frame 50 frame/s) without pitch quantization for both coders.
منابع مشابه
Improving the performance of HMM-based very low bit rate speech coding
In this paper, we define an F0 quantization scheme for a very low bit rate speech coder based on HMM (Hidden Markov Model). In the coding system, the encoder carries out phoneme recognition, and transmits phoneme indices, state durations and F0 information to the decoder. In the decoder, phoneme HMMs are concatenated according to the phoneme indices, and a sequence of mel-cepstral coefficient v...
متن کاملSyllable-based pitch encoding for low bit rate speech coding with recognition/synthesis architecture
Current HMM-based low bit rate speech coding systems work with phonetic vocoders. Pitch contour coding (on frame or phoneme level) is usually fairly orthogonal to other speech coding parameters. We make an assumption in our work that the speech signal contains supra-segmental cues. Hence, we present encoding of the pitch on the syllable level, used in the framework of a recognition/synthesis sp...
متن کاملA very low bit rate speech coder using HMM with speaker adaptation
This paper describes a speaker adaptation technique for a phonetic vocoder based on HMM. In the vocoder, the encoder performs phoneme recognition and transmits phoneme indexes and state durations to the decoder, and the decoder synthesizes speech using HMM-based speech synthesis technique. One of the main problems of this vocoder is that the voice characteristics of synthetic speech depend on H...
متن کاملDynamic Unit Selection for Very Low Bit Rate Coding at 500 bits/sec
This paper presents a new unit selection process for Very Low Bit Rate speech encoding around 500 bits/sec. The encoding is based on speech recognition and speech synthesis technologies. The aim of this approach is to use at best the speech corpus of the speaker. The proposed solution uses HMM modelling for the recognition of elementary speech units. The HMM are first trained in an unsupervised...
متن کاملA very low bit rate speech coder based on a recognition/synthesis paradigm
Recent studies have shown that a concatenative speech synthesis system with a large database produces more natural sounding speech. We apply this paradigm to the design of improved very low bit rate speech coders (sub 1000 b/s). The proposed speech coder consists of unit selection, prosody coding, prosody modification and waveform concatenation. The encoder selects the best unit sequence from a...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1998